Clustering with Side Information for Mining Text Data

Author

Vijayalakshmi P

Abstract

Side information is available alongside text documents in many text mining applications. It takes several forms, such as document provenance information, links in the document, other non-textual attributes contained in the document, or user access behavior from web logs. Such attributes may contain a very large amount of information that is useful for clustering. However, clustering becomes more difficult when some of that information is noisy. This work combines a classical partitioning algorithm with a probabilistic model to create an effective clustering approach. The clustering approach is then extended to classification and applied to real data sets, where it shows advantages over previous results.
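The combination described in the abstract can be illustrated with a minimal sketch. It is not the author's algorithm: it assumes a k-means-style partitioning step over bag-of-words vectors with cosine similarity, a Laplace-smoothed naive Bayes estimate of cluster membership given the side attributes, and a simple product of the two scores. The names `cluster`, `side_posteriors`, `seeds`, and `alpha`, and the toy data, are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0

def side_posteriors(attrs, labels, side, k, alpha=1.0):
    """Estimate P(cluster | side attributes) with smoothed naive Bayes (an assumption)."""
    joint = []
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        p = (len(members) + alpha) / (len(labels) + k * alpha)  # smoothed prior
        for a in attrs:
            hits = sum(1 for i in members if a in side[i])
            p *= (hits + alpha) / (len(members) + 2 * alpha)    # smoothed likelihood
        joint.append(p)
    total = sum(joint)
    return [p / total for p in joint]

def cluster(texts, side, seeds, iters=5, use_side=True):
    """Alternate a content-based partitioning step with a side-information step."""
    vecs = [Counter(t.lower().split()) for t in texts]
    k = len(seeds)
    centroids = [vecs[s] for s in seeds]
    # Initial partition uses text content only.
    labels = [max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vecs]
    for _ in range(iters):
        # Partitioning step: recompute each centroid from its current members.
        for j in range(k):
            members = [vecs[i] for i, lab in enumerate(labels) if lab == j]
            if members:
                centroids[j] = sum(members, Counter())
        # Reassign each document by content similarity times side-information posterior.
        new_labels = []
        for i, v in enumerate(vecs):
            sims = [cosine(v, c) for c in centroids]
            post = side_posteriors(side[i], labels, side, k) if use_side else [1.0] * k
            new_labels.append(max(range(k), key=lambda j: sims[j] * post[j]))
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Toy data: the last document reads like sports but carries CS side attributes.
texts = [
    "clustering text documents kmeans",
    "text clustering partitioning algorithm",
    "soccer match goal team",
    "team wins soccer league",
    "algorithm text team soccer",
]
side = [
    {"venue:cs", "author:a"},
    {"venue:cs", "author:a"},
    {"venue:sports", "author:b"},
    {"venue:sports", "author:b"},
    {"venue:cs", "author:a"},
]

print(cluster(texts, side, seeds=[0, 2], use_side=False))  # content only
print(cluster(texts, side, seeds=[0, 2]))                  # content + side information
```

On this toy data, the content-only run groups the last document with the sports documents, while the run that uses side information moves it to the cluster of documents sharing its venue and author. Multiplying the two scores is only one possible design choice; if the side attributes are noisy, a weighted combination that down-weights them may be preferable.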

Similar resources

A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...

A Review on Categorization of Text Data Using Side Information

In today's digital environment, text databases are growing rapidly due to the use of the internet and communication media. Different text mining techniques are used for knowledge discovery and information retrieval. Text data often comes with side information. Side information may be metadata associated with the text data, such as author, co-author or citation network, document pro...

An Approach to Semi-Supervised Co-Clustering With Side Information in Text Mining

Nowadays, in many text mining applications, a considerable quantity of information from documents is present in the form of text. This text information contains various types of information such as side information or metadata. This side information is easily obtainable from the text document. Such side information may be of distinct types, such as document provenance information, user access behaviour from...

Document Clustering Based on Ontology and a Fuzzy Approach

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering is one of the important techniques of text mining; it is the unsupervised classification of similar documents into different groups. The most important step...

A Model for Extracting Information from Text Documents, Based on Text Mining in the E-Learning Domain

As computer networks become the backbones of science and the economy, enormous quantities of documents become available. Text mining techniques have therefore been used to extract useful information from textual data. Text mining has become an important research area that discovers unknown information, facts or new hypotheses by automatically extracting information from different written documents. T...

Publication date: 2015
